NSF PAR Search | NSF Public Access Repository


Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration

Luo, Katie Z; Dao, Minh-Quan; Liu, Zhenzhen; Campbell, Mark; Chao, Wei-Lun; Weinberger, Kilian Q; Malis, Ezio; Fremont, Vincent; Hariharan, Bharath; Shan, Mao; et al (October 2025, IEEE)

Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems. However, existing V2X datasets are limited in scope, diversity, and quality. To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes collected from three connected autonomous vehicles (CAVs) equipped with two different configurations of LiDAR sensors, plus a roadside unit with dual LiDARs. Our dataset provides point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training. We provide detailed statistical analysis on the quality of our dataset and extensively benchmark existing V2X methods on it. Mixed Signals is ready-to-use, with precise alignment and consistent annotations across time and viewpoints. We hope our work advances research in the emerging, impactful field of V2X perception. Dataset details at https://mixedsignalsdataset.cs.cornell.edu/.
Full Text Available
PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation

Gong, Albert; Stankeviciute, Kamile; Wan, Chao; Kabra, Anmol; Thesmar, Raphael; Lee, Johann; Klenke, Julius; Gomes, Carla P; Weinberger, Kilian Q (July 2025, Proceedings of Machine Learning Research)

High-quality benchmarks are essential for evaluating reasoning and retrieval capabilities of large language models (LLMs). However, curating datasets for this purpose is not a permanent solution as they are prone to data leakage and inflated performance results. To address these challenges, we propose PhantomWiki: a pipeline to generate unique and factually consistent document corpora with diverse question-answer pairs. Unlike prior work, PhantomWiki is neither a fixed dataset, nor is it based on any existing data. Instead, a new PhantomWiki instance is generated on demand for each evaluation. We vary the question difficulty and corpus size to disentangle reasoning and retrieval capabilities respectively, and find that PhantomWiki datasets are surprisingly challenging for frontier LLMs. Thus, we contribute a scalable and data leakage-resistant framework for disentangled evaluation of reasoning, retrieval, and tool-use abilities.
Full Text Available
On Speeding Up Language Model Evaluation

Zhou, Jin Peng; Belardi, Christian K; Wu, Ruihan; Zhang, Travis; Gomes, Carla P; Sun, Wen; Weinberger, Kilian Q (June 2025, International Conference on Learning Representations)

Developing prompt-based methods with Large Language Models (LLMs) requires making numerous decisions, which give rise to a combinatorial search problem over hyper-parameters. Exhaustively evaluating every combination can be time-consuming and costly. In this paper, we propose an adaptive approach to explore this space. We exploit the fact that often only a few samples are needed to identify clearly superior or inferior settings, and that many evaluation tests are highly correlated. We lean on multi-armed bandits to sequentially identify the next (method, validation sample) pair to evaluate and use low-rank matrix factorization to fill in missing evaluations. We carefully assess the efficacy of our approach on several competitive benchmark problems and show that it can identify the top-performing method using only 5-15% of the typical resources, resulting in 85-95% LLM cost savings. Our code is available at https://github.com/kilian-group/banditeval.
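
As a rough illustration of the bandit idea in this abstract, the sketch below uses a standard UCB1 rule to decide which method to evaluate next on a randomly drawn, not-yet-used validation sample. It is not the authors' implementation: the function name `ucb_select_best`, the precomputed `scores` matrix (standing in for costly LLM evaluations), and the choice of UCB1 are all assumptions made for illustration, and the low-rank matrix factorization step is omitted entirely.

```python
import math
import random


def ucb_select_best(scores, budget, seed=0):
    """Pick the best row (method) of `scores` using at most `budget` cell lookups.

    `scores[m][s]` is a hypothetical stand-in for the result of evaluating
    method m on validation sample s; each lookup counts as one evaluation.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    rng = random.Random(seed)
    n_methods = len(scores)
    n_samples = len(scores[0])
    # Per-method shuffled sample indices, so draws are random without replacement.
    remaining = [rng.sample(range(n_samples), n_samples) for _ in range(n_methods)]
    totals = [0.0] * n_methods
    counts = [0] * n_methods
    used = 0

    def observe(m):
        nonlocal used
        s = remaining[m].pop()
        totals[m] += scores[m][s]
        counts[m] += 1
        used += 1

    # Evaluate each method once (budget permitting) so every mean is defined.
    for m in range(n_methods):
        if used < budget:
            observe(m)

    # UCB1: repeatedly evaluate the method with the highest optimistic estimate.
    while used < budget:
        candidates = [m for m in range(n_methods) if remaining[m]]
        if not candidates:
            break

        def ucb(m):
            return totals[m] / counts[m] + math.sqrt(2 * math.log(used) / counts[m])

        observe(max(candidates, key=ucb))

    evaluated = [m for m in range(n_methods) if counts[m] > 0]
    best = max(evaluated, key=lambda m: totals[m] / counts[m])
    return best, used
```

With three methods and 100 validation samples, a budget of 30 lookups corresponds to 10% of the full 300-cell evaluation grid, which is the kind of saving the abstract reports; how well a plain UCB1 rule matches the paper's numbers is not something this sketch can show.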
Full Text Available
Learning 3D Perception from Others' Predictions

Yoo, Jinsu; Feng, Zhenyang; Pan, Tai-Yu; Sun, Yihong; Phoo, Cheng Perng; Chen, Xiangyu; Campbell, Mark; Weinberger, Kilian Q; Hariharan, Bharath; Chao, Wei-Lun (April 2025, International Conference on Learning Representations)

Accurate 3D object detection in real-world environments requires a huge amount of annotated data with high quality. Acquiring such data is tedious and expensive, and often needs repeated effort when a new sensor is adopted or when the detector is deployed in a new environment. We investigate a new scenario to construct 3D object detectors: learning from the predictions of a nearby unit that is equipped with an accurate detector. For example, when a self-driving car enters a new area, it may learn from other traffic participants whose detectors have been optimized for that area. This setting is label-efficient, sensor-agnostic, and communication-efficient: nearby units only need to share the predictions with the ego agent (e.g., car). Naively using the received predictions as ground-truths to train the detector for the ego car, however, leads to inferior performance. We systematically study the problem and identify viewpoint mismatches and mislocalization (due to synchronization and GPS errors) as the main causes, which unavoidably result in false positives, false negatives, and inaccurate pseudo labels. We propose a distance-based curriculum, first learning from closer units with similar viewpoints and subsequently improving the quality of other units' predictions via self-training. We further demonstrate that an effective pseudo label refinement module can be trained with a handful of annotated data, largely reducing the data quantity necessary to train an object detector. We validate our approach on the recently released real-world collaborative driving dataset, using reference cars' predictions as pseudo labels for the ego car. Extensive experiments including several scenarios (e.g., different sensors, detectors, and domains) demonstrate the effectiveness of our approach toward label-efficient learning of 3D perception from other units' predictions.
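
One way to picture the distance-based curriculum mentioned in this abstract is as a schedule that admits pseudo-labeled frames in order of how close the reference unit was to the ego car. The sketch below is only an assumption-laden illustration, not the authors' method: the `Frame` type, its `distance_m` field, the cumulative stage design, and the thresholds are all hypothetical, and the self-training and label refinement steps are left out.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    frame_id: str
    distance_m: float  # hypothetical: ego-to-reference-unit distance in meters


def curriculum_stages(frames, max_distances):
    """Return one list of frames per stage, nearest frames first.

    Stage k contains every frame whose distance is within max_distances[k],
    so later stages are supersets of earlier ones (an assumed design choice).
    """
    if list(max_distances) != sorted(max_distances):
        raise ValueError("max_distances must be non-decreasing")
    ordered = sorted(frames, key=lambda f: f.distance_m)
    return [[f for f in ordered if f.distance_m <= limit] for limit in max_distances]
```

For example, frames recorded at 5 m, 20 m, and 60 m with thresholds of 10, 30, and 100 m would be introduced one per stage, so training starts from the frames whose viewpoint is most similar to the ego car's.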
Full Text Available
Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene

https://doi.org/10.1109/CVPR52734.2025.01123

Pan, Tai-Yu; Jeon, Sooyoung; Fan, Mengdi; Yoo, Jinsu; Feng, Zhenyang; Campbell, Mark; Weinberger, Kilian Q; Hariharan, Bharath; Chao, Wei-Lun (June 2025, IEEE)

Self-driving cars relying solely on ego-centric perception face limitations in sensing, often failing to detect occluded, faraway objects. Collaborative autonomous driving (CAV) seems like a promising direction, but collecting data for development is non-trivial: it requires placing multiple sensor-equipped agents in a real-world driving scene simultaneously. As such, existing datasets are limited in locations and agents. We introduce a novel surrogate: generating realistic perception from different viewpoints in a driving scene, conditioned on a real-world sample, namely the ego-car’s sensory data. This surrogate could turn any ego-car dataset into a collaborative driving one, scaling up the development of CAV. We present the very first solution, using a combination of simulated collaborative data and real ego-car data. Our method, Transfer Your Perspective (TYP), learns a conditioned diffusion model whose output samples are not only realistic but also consistent in both semantics and layouts with the given ego-car data. Empirical results demonstrate TYP’s effectiveness in a CAV setting. In particular, TYP enables us to (pre-)train collaborative perception algorithms like early and late fusion with little or no real-world collaborative data, greatly facilitating downstream CAV applications.
Full Text Available
DiffuBox: Refining 3D Object Detection with Point Diffusion

Chen, Xiangyu; Liu, Zhenzhen; Luo, Katie Z; Datta, Siddhartha; Polavaram, Adhitya; Wang, Yan; You, Yurong; Li, Boyi; Pavone, Marco; Chao, Wei-Lun; et al (December 2024, Advances in Neural Information Processing Systems 37 (NeurIPS 2024))

Ensuring robust 3D object detection and localization is crucial for many applications in robotics and autonomous driving. Recent models, however, face difficulties in maintaining high performance when applied to domains with differing sensor setups or geographic locations, often resulting in poor localization accuracy due to domain shift. To overcome this challenge, we introduce a novel diffusion-based box refinement approach. This method employs a domain-agnostic diffusion model, conditioned on the LiDAR points surrounding a coarse bounding box, to simultaneously refine the box’s location, size, and orientation. We evaluate this approach under various domain adaptation settings, and our results reveal significant improvements across different datasets, object classes and detectors. Our PyTorch implementation is available at https://github.com/cxy1997/DiffuBox.
Full Text Available
Learning to Detect Mobile Objects from LiDAR Scans Without Labels

You, Y; Luo, Katie Z; Phoo, Cheng P; Chao, W; Sun, W; Hariharan, B; Campbell, M; Weinberger, Kilian Q (June 2024, Conference on Computer Vision and Pattern Recognition, June 2022)

Full Text Available
Latent Diffusion for Language Generation

Lovelace, Justin; Kishore, Varsha; Wan, Chao; Shekhtman, Eliot; Weinberger, Kilian Q. (December 2023, Advances in neural information processing systems)
Oh, Alice; Naumann, Tristan; Globerson, Amir; Saenko, Kate; Hardt, Moritz; Levine, Sergey (Eds.)

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models. Our code is available at https://github.com/justinlovelace/latent-diffusion-for-language.
Full Text Available
Better Monocular 3D Detectors with LiDAR from the Past

https://doi.org/10.1109/ICRA57147.2024.10610444

You, Yurong; Phoo, Cheng Perng; Andres Diaz-Ruiz, Carlos; Luo, Katie Z; Chao, Wei-Lun; Campbell, Mark; Hariharan, Bharath; Weinberger, Kilian Q (May 2024, IEEE)

Full Text Available
Pre-Training LiDAR-Based 3D Object Detectors Through Colorization

Pan, Tai-Yu; Ma, Chenyang; Chen, Tianle; Phoo, Cheng Perng; Luo, Katie Z; You, Yurong; Campbell, Mark; Weinberger, Kilian Q; Hariharan, Bharath; Chao, Wei-Lun (May 2024, International Conference on Learning Representations)

Full Text Available
